Concurrent Discretization of Multiple Attributes

Authors

  • Ke Wang
  • Bing Liu
Abstract

Better decision trees can be learnt by merging continuous values into intervals. Merging of values, however, could introduce inconsistencies to the data, or information loss. When it is desired to maintain a certain consistency, interval mergings in one attribute could disable those in another attribute. This interaction raises the issue of determining the order of mergings. We consider a globally greedy heuristic that selects the best merging from all continuous attributes at each step. We present an implementation of the heuristic in which the best merging is determined in a time independent of the number of possible mergings. Experiments show that intervals produced by the heuristic lead to improved decision trees.
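The globally greedy idea in the abstract can be illustrated with a small brute-force sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `inconsistency` and `greedy_merge`, the choice of inconsistency measure (rows that disagree with the majority class of their discretized cell), and the selection rule (the merge leaving the lowest inconsistency). The sketch also rescans every candidate at each step, so it does not reproduce the paper's implementation, whose selection time is independent of the number of possible mergings.

```python
from collections import Counter, defaultdict

def inconsistency(rows, labels, cuts):
    """Count rows that disagree with the majority class of their discretized cell."""
    cells = defaultdict(Counter)
    for row, label in zip(rows, labels):
        # Interval index of a value = number of cut points below it.
        key = tuple(sum(v > c for c in cuts[a]) for a, v in enumerate(row))
        cells[key][label] += 1
    return sum(sum(c.values()) - max(c.values()) for c in cells.values())

def greedy_merge(rows, labels, max_inconsistency=0):
    """Repeatedly apply the best allowed merge over ALL attributes (hypothetical rule)."""
    n_attrs = len(rows[0])
    # Start with one cut point between each pair of adjacent distinct values.
    cuts = []
    for a in range(n_attrs):
        vals = sorted({row[a] for row in rows})
        cuts.append([(lo + hi) / 2 for lo, hi in zip(vals, vals[1:])])
    while True:
        best = None
        for a in range(n_attrs):
            for i in range(len(cuts[a])):
                # Removing a cut point merges the two intervals it separates.
                trial = [c[:] for c in cuts]
                del trial[a][i]
                score = inconsistency(rows, labels, trial)
                if score <= max_inconsistency and (best is None or score < best[0]):
                    best = (score, a, i)
        if best is None:
            return cuts  # no merge keeps the data consistent enough
        _, a, i = best
        del cuts[a][i]

rows = [(1, 10), (2, 10), (3, 20), (4, 20)]
labels = ["a", "a", "b", "b"]
result = greedy_merge(rows, labels)
```

In this toy data either attribute alone separates the classes. Once the first attribute has been merged into a single interval, the remaining cut in the second attribute can no longer be removed without creating inconsistencies, which is the kind of interaction between attributes the abstract describes.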


Similar resources

Discretization Based on Entropy and Multiple Scanning

In this paper we present entropy driven methodology for discretization. Recently, the original entropy based discretization was enhanced by including two options of selecting the best numerical attribute. In one option, Dominant Attribute, an attribute with the smallest conditional entropy of the concept given the attribute is selected for discretization and then the best cut point is determine...


Discretization Numbers for Multiple-Instances Problem in Relational Database

Handling numerical data stored in a relational database is different from handling those numerical data stored in a single table due to the multiple occurrences of an individual record in the non-target table and non-determinate relations between tables. Most traditional data mining methods only deal with a single table and discretize columns that contain continuous numbers into nominal...


Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning

Since most real-world applications of classification learning involve continuous-valued attributes, properly addressing the discretization process is an important problem. This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued attribute into multiple intervals. We briefly present theoretical evidence for the appropriateness of this h...
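As a rough illustration of the entropy minimization heuristic mentioned above, the following sketch picks a single binary cut point that minimizes the size-weighted class entropy of the two resulting intervals. The function names `entropy` and `best_cut` are hypothetical, and the sketch covers only one cut; how the method extends to multiple intervals and when it stops splitting are not shown here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (weighted_entropy, cut_point) of the best binary split, or None."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        cost = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint between neighbours
        if best is None or cost < best[0]:
            best = (cost, cut)
    return best

cost, cut = best_cut([1, 2, 3, 4], ["a", "a", "b", "b"])
```

For this input the chosen cut lies between 2 and 3, where both sides become pure.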


Chi2: feature selection and discretization of numeric attributes

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant attributes. This paper describes Chi2, a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization. The empirical results demonstrate that Chi2 i...
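The statistic such a method would evaluate for a pair of adjacent intervals can be sketched as the standard Pearson χ² over a 2 × k table of class counts. This is a minimal sketch under stated assumptions: the function name `chi2_adjacent` and the dictionary input format are hypothetical, and cells with zero expected count are simply skipped, which may differ from how the Chi2 paper handles them.

```python
def chi2_adjacent(counts_a, counts_b):
    """Pearson chi-square for two adjacent intervals given {class: count} dicts."""
    classes = set(counts_a) | set(counts_b)
    n = sum(counts_a.values()) + sum(counts_b.values())
    stat = 0.0
    for interval in (counts_a, counts_b):
        row_total = sum(interval.values())
        for cls in classes:
            col_total = counts_a.get(cls, 0) + counts_b.get(cls, 0)
            expected = row_total * col_total / n
            if expected > 0:  # assumption: skip empty expected cells
                stat += (interval.get(cls, 0) - expected) ** 2 / expected
    return stat

same = chi2_adjacent({"a": 2, "b": 2}, {"a": 1, "b": 1})
different = chi2_adjacent({"a": 2}, {"b": 2})
```

A low value means the two intervals have similar class distributions and are good candidates for merging; a high value suggests the boundary between them carries class information.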


An Evolution Strategies Approach to the Simultaneous Discretization of Numeric Attributes

Many data mining and machine learning algorithms require databases in which objects are described by discrete attributes. However, it is very common that the attributes are in the ratio or interval scales. In order to apply these algorithms, the original attributes must be transformed into the nominal or ordinal scale via discretization. An appropriate transformation is crucial because of the l...



Publication date: 1998